We develop a transfer matrix formalism to visualize the framing of discrete piecewise linear curves
in three dimensional space. Our approach is based on the concept of an intrinsically discrete curve,
which enables us to more effectively describe curves that, in the limit where the length of line
segments vanishes, approach fractal structures in lieu of continuous curves. We verify that in the case
of differentiable curves the continuum limit of our discrete equation does reproduce the generalized
Frenet equation. As an application we consider folded proteins, whose Hausdorff dimension is known
to be fractal. We explain how to employ the orientation of Cβ carbons of amino acids along a protein
backbone to introduce a preferred framing along the backbone. By analyzing the experimentally
resolved fold geometries in the Protein Data Bank we observe that this framing relates intimately
to the discrete Frenet framing. We also explain how inflection points can be located in the loops,
and clarify their distinctive role in determining the loop structure of folded proteins.

I. INTRODUCTION

The visualization of a three dimensional discrete framed curve is an important and widely studied topic in computer 
graphics, from the association of ribbons and tubes to the determination of camera gaze directions along trajectories. 
Potential applications range from aircraft and robot kinematics to stereo reconstruction and virtual reality [1], [2].

We are interested in addressing the problem of characterizing the physical laws that govern protein folding. For 
this we develop a technique for framing a general discrete and piecewise linear curve in a manner that will eventually 
enable us to combine the geometric problem of framing with an appropriate physical principle for frame determination. 
Our ultimate goal is to have an approach, where instead of purely geometric considerations the frames along a curve 
are determined directly from the properties of an underlying physical system. As a consequence we expect that our 
formalism and our results will find wide applicability well beyond the protein folding problem. 

The classical theory of continuous curves in three dimensional space employs the Frenet equation (2) to determine
a moving coordinate frame along a sufficiently differentiable space curve. However, if the curve has inflection points
and/or straight segments, or if it fails to be at least three times continuously differentiable, the Frenet frame either
becomes discontinuous or may not even exist. In such cases there can be good reasons to introduce
an alternative framing such as Bishop's parallel transport frame [3], a geodetic reference frame or some possibly hybrid
variants [1], [2].

In this article we derive a discrete version of the Frenet equation that introduces a framing along an intrinsically
discrete and piecewise linear curve in $\mathbb{R}^3$. We develop the general formalism for the visualization of such a curve
without any underlying assumption that it approaches a continuous space curve in the limit where the maximum
length of its line segments goes to zero. The continuum limit may as well be a fractal, with a nontrivial Hausdorff
dimension. Thus, unlike in several approaches that we are aware of, our starting point is not in a discretization of
the continuum Frenet equation. Instead our approach is intrinsically discrete, and it is based on the transfer matrix
formalism that is widely used for example in lattice field theories [4]. Indeed, we find it useful to adapt some notions
of lattice gauge theories [4]. For us this provides a valuable conceptual point of view. Moreover, since the transfer
matrix formalism intrinsically incorporates self-similarity and the very concept of line segment length has no role in
our derivations, we can effortlessly consider curves that have fractal continuum limits while at the same time ensuring
that if the continuum limit exists as a class $C^3$ space curve we recover the standard Frenet framing together with its
generalized versions.

As an application we consider folded proteins, for which the continuum limit is known to be a fractal with Hausdorff
dimension that is very close to three [5]. The locations of the central Cα carbon atoms along the protein determine
a discrete piecewise linear curve; this is the protein backbone. We introduce a framing to the backbone by employing
the Cβ carbon atoms of the side chain amino acids that are covalently bonded to the Cα carbons that define the
backbone. The frame at the location of a given Cα carbon is determined by the directional vector that connects it
with the ensuing Cβ carbon, together with the directional vector that connects it to the next Cα carbon along the
backbone. By inspecting the framing of all protein structures in the Protein Data Bank (PDB) [10] we find that such
a Cβ framing relates intimately to the discrete Frenet framing of the backbone. In particular, we conclude that for a
folded protein the concept of an inflection point acquires an intrinsic biological interpretation: it coincides with the
location of the center of the loop. The inflection points drive the protein loop geometry.

FIG. 1: A curve with inflection point (yellow ball). At each point the direction of the (Frenet frame) normal vectors (green) is
towards the center of an osculating circle. There is a discontinuity in the direction of the normal vectors when we traverse the
inflection point. At this point the radius of the osculating circle diverges and the normal vector n becomes abruptly reflected
in the osculating plane from one side to the other side of the curve. The blue vector equals the opposite of the (reflected) normal
vector n (see also Figure 6).

At an isolated inflection point of a continuous curve, the curvature, which is a frame independent geometric characteristic
of the curve, vanishes. At such a point the Frenet frame can become discontinuous (see Figure 1). Consequently
a single non-degenerate inflection point can not be removed by any local continuous deformation of the curve. An
isolated non-degenerate inflection point can only be locally and continuously removed in the presence of another
inflection point, by deforming the curve so that the inflection points annihilate each other in a saddle-node bifurcation.
In particular a sole non-degenerate inflection point can be removed only by translating it away through an endpoint
of the curve, which involves a global deformation of the curve. This kind of stability enjoyed by an isolated inflection
point under local deformations of the curve is the hallmark of a topological soliton. Indeed, let us recall the topological
kink-soliton in a quartic double-well potential [9]

$$y(s) = c \cdot \tanh[m(s - s_0)] \tag{1}$$

It describes a trajectory that interpolates between the two minima $y = \pm c$ of the potential $V(s)$; see Figure 2. The
center of the soliton is at the point $s = s_0$ where $y(s)$ vanishes. The influence of this center point on the global
topology of the trajectory can not be removed by any kind of continuous local deformation $y(s) \to y(s) + \delta y(s)$, as the
resulting curve continues to retain its characteristic global property that $y(s) \to \pm c$ as $s \to \pm\infty$. Thus the deformed $y(s)$
necessarily vanishes at least at one point. The goal of the present paper is to explain how this signature behaviour of
a topological soliton can be detected and described in the case of discrete piecewise linear curves, and in particular
those curves that relate to the framing of folded proteins.



II. THE GENERALIZED FRENET FRAME AND INFLECTION POINTS



A. The Generalized Frenet Frame



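FIG. 2: The kink-soliton (right) interpolates between the two ground states at $y = \pm c$ of the potential (left) as $s \to \pm\infty$. It
is topologically stable and can not be removed by any finite energy deformation.

We start by describing the continuum Frenet equation and its generalizations. Let $\mathbf{x}(s)$ be a space curve in $\mathbb{R}^3$. Its
unit tangent vector

$$\mathbf{t} = \frac{\dot{\mathbf{x}}}{\|\dot{\mathbf{x}}\|} \equiv \frac{1}{\|\dot{\mathbf{x}}\|}\frac{d\mathbf{x}(s)}{ds}$$

(we assume that $\|\dot{\mathbf{x}}\| \neq 0$) is subject to the Frenet equation [1], [2]

$$\frac{d}{ds}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix} = \|\dot{\mathbf{x}}\|\begin{pmatrix}0 & \tau & -\kappa\\ -\tau & 0 & 0\\ \kappa & 0 & 0\end{pmatrix}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix} \tag{2}$$

where

$$\mathbf{b} = \frac{\dot{\mathbf{x}}\times\ddot{\mathbf{x}}}{\|\dot{\mathbf{x}}\times\ddot{\mathbf{x}}\|}$$

is the unit binormal vector and

$$\mathbf{n} = \mathbf{b}\times\mathbf{t}$$

is the unit normal vector of the curve, and

$$\kappa = \frac{\|\dot{\mathbf{x}}\times\ddot{\mathbf{x}}\|}{\|\dot{\mathbf{x}}\|^3}$$

is the frame independent curvature of $\mathbf{x}(s)$ and

$$\tau = \frac{(\dot{\mathbf{x}}\times\ddot{\mathbf{x}})\cdot\dddot{\mathbf{x}}}{\|\dot{\mathbf{x}}\times\ddot{\mathbf{x}}\|^2}$$

is the torsion. The three vectors $(\mathbf{n}, \mathbf{b}, \mathbf{t})$ form the right-handed orthonormal Frenet frame at each point of the curve.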
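As a numerical illustration of the formulas above, the following minimal sketch evaluates the Frenet frame, curvature and torsion from the first three derivatives of a curve. The function names and the circular helix used as a test curve are illustrative assumptions, not part of the original text; for a helix $(a\cos u, a\sin u, cu)$ the standard values $\kappa = a/(a^2+c^2)$ and $\tau = c/(a^2+c^2)$ serve as a check.

```python
import numpy as np

def frenet_frame(d1, d2, d3):
    """Frenet frame (n, b, t), curvature and torsion from the first three
    derivatives of x with respect to an arbitrary curve parameter."""
    cross = np.cross(d1, d2)
    t = d1 / np.linalg.norm(d1)
    b = cross / np.linalg.norm(cross)      # undefined where the curvature vanishes
    n = np.cross(b, t)
    kappa = np.linalg.norm(cross) / np.linalg.norm(d1) ** 3
    tau = np.dot(cross, d3) / np.dot(cross, cross)
    return n, b, t, kappa, tau

def helix_derivatives(u, a=1.0, c=0.5):
    """Derivatives of the hypothetical test curve x(u) = (a cos u, a sin u, c u)."""
    d1 = np.array([-a * np.sin(u), a * np.cos(u), c])
    d2 = np.array([-a * np.cos(u), -a * np.sin(u), 0.0])
    d3 = np.array([a * np.sin(u), -a * np.cos(u), 0.0])
    return d1, d2, d3

n, b, t, kappa, tau = frenet_frame(*helix_derivatives(0.3))
print(kappa, tau)
```

The binormal is obtained by dividing by the norm of the cross product of the first two derivatives, so the sketch fails exactly where the text says the Frenet frame fails, at points of vanishing curvature.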

In the following we shall assume, with no loss of generality, that $s \in [0, L]$ measures the proper length along a curve
with total length $L$ in $\mathbb{R}^3$, so that

$$\|\dot{\mathbf{x}}\| = 1 \tag{3}$$

Consider a curve with an isolated non-degenerate inflection point (or more generally a straight segment) such as the
one depicted in Figure 1. At the inflection point $s = s_0$ the Frenet frame can not be introduced since $\kappa(s_0)$ vanishes;
in the proper length gauge

$$\kappa(s_0) = \|\ddot{\mathbf{x}}(s_0)\| = 0$$
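Conventionally, see e.g. [6], in the presence of inflection points the Frenet equation (2) is introduced only
piecewise between the inflection points, for those values of $s$ for which $\kappa(s)$ is nonvanishing. But there are also
alternative approaches that allow for a continuous passage of the frame through the inflection point (more generally
straight segments). For this we view the Frenet frame as an example of a general frame, obtained by starting from
the observation that while the tangent vector $\mathbf{t}(s)$ for a given curve is unique, instead of $\{\mathbf{n}(s), \mathbf{b}(s)\}$ we may choose
an arbitrary orthogonal basis $\{\mathbf{e}_1(s), \mathbf{e}_2(s)\}$ for the normal planes of the curve that are perpendicular to $\mathbf{t}(s)$, without
deforming the curve. This general frame is related to the Frenet frame by a local SO(2) frame rotation around the
frame independent tangent vector $\mathbf{t}(s)$ (see Figure 3),

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\end{pmatrix} \to \begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\end{pmatrix} = \begin{pmatrix}\cos\eta(s) & -\sin\eta(s)\\ \sin\eta(s) & \cos\eta(s)\end{pmatrix}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\end{pmatrix} \tag{4}$$

FIG. 3: The (blue) Frenet frame (n, b) and a generic (green) orthogonal frame $(\mathbf{e}_1, \mathbf{e}_2)$ on the normal plane of t, the tangent
vector of the curve.

The ensuing rotated version of the Frenet equation is

$$\frac{d}{ds}\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix} = \begin{pmatrix}0 & (\tau - \dot\eta) & -\kappa\cos\eta\\ -(\tau - \dot\eta) & 0 & -\kappa\sin\eta\\ \kappa\cos\eta & \kappa\sin\eta & 0\end{pmatrix}\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix} \tag{5}$$

If we recall the adjoint basis of SO(3) Lie algebra

$$T^1 = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0\end{pmatrix}, \qquad T^2 = \begin{pmatrix}0 & 0 & 1\\ 0 & 0 & 0\\ -1 & 0 & 0\end{pmatrix}, \qquad T^3 = \begin{pmatrix}0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix} \tag{6}$$

where

$$[T^a, T^b] = \epsilon^{abc}\,T^c$$

we find that on $\tau$ and $\kappa$ the SO(2) transformation acts as follows,

$$\tau \to \tau - \dot\eta \tag{7}$$

$$\kappa T^2 \to \kappa\,(T^2\cos\eta - T^1\sin\eta) = e^{\eta T^3}\,(\kappa T^2)\,e^{-\eta T^3} \tag{8}$$

If instead of $\eta = 0$, which specifies the Frenet frame (Frenet gauge), we select $\eta(s)$ so that

$$\eta(s) = \int_0^s \tau(s')\,ds'$$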



we arrive at Bishop's parallel transport frame [3], [1], [2] that can be defined continuously and unambiguously through
inflection points. We note that (7), (8) can be interpreted in terms of a SO(2) gauge multiplet: the change (7) in
$\tau(s)$ is identical to the SO(2) ≃ U(1) gauge transformation of a one-dimensional gauge vector, while $\kappa(s)$ transforms
like a component of a SO(2) scalar doublet. This leads us to a gauge invariant quantity, the complex valued Hasimoto
variable [8]

$$\xi(s) = \kappa(s)\,\exp\left(i\int_0^s \tau\,ds'\right) \tag{9}$$



When we combine (7) with a SO(2) ⊂ SO(3) rotation (8) by $\eta(s)$ around the $T^3$-direction of the SO(3) Lie algebra,
the effect on (9) can be summarized as follows,

$$\xi(s) \to \kappa(s)\,e^{i\eta(s)}\,\exp\left(i\int_0^s \tau\,ds' - i\eta(s)\right) \equiv \xi(s) \tag{10}$$

and thus the Hasimoto variable $\xi(s)$ is manifestly independent of $\eta(s)$. (Note however, that the $\eta(0)$ dependence
remains as an overall global phase ambiguity which is inherent to (10): the local gauge invariance becomes eliminated
but a global one remains.) In fact, the Hasimoto variable simply combines the two real components of the SO(2)
scalar doublet into a single complex valued variable, with modulus that equals the frame independent a.k.a. gauge
invariant geometric curvature of the curve. In particular the Frenet frame is like the widely used "Unitary Gauge" in
the Abelian Higgs Model [7].

We find this language of gauge transformations in connection with the frame rotations introduced in (4) to be intuitively
appealing and beneficial, and we shall use it frequently in the sequel.



B. Inflection Points



We proceed to consider a continuous curve with $n$ inflection points at $s = s_i$,

$$s_0 = 0 < \dots < s_i < s_{i+1} < \dots < L = s_{n+1}$$

For simplicity we assume that the inflection points are isolated and non-degenerate zeroes of the curvature

$$\kappa(s_i) = 0$$

A generalization to more involved inflection points is straightforward. We take the curve to be of class $C^3$. This
ensures that at each segment $(s_i, s_{i+1})$ the curvature is of class $C^1$. Furthermore, since the inflection points are
non-degenerate, as we approach an inflection point the left and right derivatives of the curvature are non-vanishing and
in the limit when $s \to s_i$ they become equal in magnitude but have an opposite sign,

$$\frac{d\kappa(s)}{ds}\bigg|_{s_i^+} = -\frac{d\kappa(s)}{ds}\bigg|_{s_i^-} \neq 0$$

This jump in the derivative of the curvature is the signature of an inflection point in the Frenet frame. But even
though the curvature $\kappa(s)$ fails to be continuously differentiable, the signed curvature

$$k(s) = \sum_{i=0}^{n} (-1)^i\,\kappa(s)\,\theta(s - s_i)\,\theta(s_{i+1} - s) \tag{11}$$

with $\theta(s)$ the unit step-function

$$\theta(s) = \begin{cases} 1 & s > 0 \\ 0 & s < 0 \end{cases}$$

is now continuously differentiable for all $s \in [0, L]$ and in particular

$$\frac{dk(s)}{ds}\bigg|_{s_i} \neq 0$$

The original Frenet curvature $\kappa(s)$ and the signed curvature $k(s)$ are related by a gauge transformation (8) of the
Frenet frame, with $\eta(s)$ given by the following gauge transformation (7) of the Frenet torsion

$$\tau(s) \to \tau(s) - \dot\eta(s) = \tau(s) - \pi\,\frac{d}{ds}\sum_{i=1}^{n}\theta(s - s_i) = \tau(s) - \pi\sum_{i=1}^{n}\delta(s - s_i) \tag{12}$$
This can be immediately verified by comparing the form of (11) with that of the Hasimoto variable (9), (10). We
may call this gauge transformed version of the Frenet frame the $Z_2$-Frenet frame; its discrete version will become
important to us when we consider applications to folded proteins.

For a concrete example we take the plane curve in Figure 1. For this curve, in the vicinity of the inflection point
the Frenet curvature clearly has a qualitative form that may be described by the absolute value of the kink-soliton
profile (1),

$$\kappa(s) \approx \kappa_0\,|\tanh[m(s - s_0)]|$$

Obviously the derivative of this curvature is discontinuous with a finite jump at the inflection point where $s = s_0$.
This discontinuity reflects itself in the abrupt change in the direction of the (green) normal vector n, as depicted in
Figure 1. The ensuing signed curvature (11) is qualitatively described by the kink-soliton (1)

$$k(s) \approx \kappa_0 \tanh[m(s - s_0)] \tag{13}$$

and it is manifestly continuously differentiable, including the point $s = s_0$. Now the direction of the corresponding
normal vector is also continuous through the inflection point. This is because the change in its direction becomes
compensated by the change in the sign of the signed curvature when we cross the inflection point; see the blue vectors
in Figure 1, and Figure 6.
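The difference between the two profiles can be checked numerically. The sketch below compares one-sided finite-difference slopes of the Frenet curvature and of the signed curvature at the inflection point; the parameter values and function names are illustrative assumptions.

```python
import numpy as np

kappa0, m, s0 = 1.0, 2.0, 0.0   # hypothetical profile parameters

def frenet_curvature(s):
    """Absolute value of the kink profile: non-negative, with a cusp at s0."""
    return kappa0 * np.abs(np.tanh(m * (s - s0)))

def signed_curvature(s):
    """Kink profile (13): changes sign smoothly at s0."""
    return kappa0 * np.tanh(m * (s - s0))

def one_sided_slopes(f, s, h=1e-6):
    """Left and right finite-difference derivatives of f at s."""
    left = (f(s) - f(s - h)) / h
    right = (f(s + h) - f(s)) / h
    return left, right

frenet_left, frenet_right = one_sided_slopes(frenet_curvature, s0)
signed_left, signed_right = one_sided_slopes(signed_curvature, s0)
print(frenet_left, frenet_right)   # opposite signs: the derivative jumps
print(signed_left, signed_right)   # equal: the derivative is continuous
```

The one-sided slopes of the Frenet curvature differ by a sign, while those of the signed curvature agree and are non-zero, which is the behaviour described in the text.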

III. THE DISCRETE FRENET EQUATION

A. The Discrete Frenet Frame

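In the sequel we are primarily interested in an open and oriented, piecewise linear discrete curve that we describe
by a three-vector $\mathbf{r}(s) \in \mathbb{R}^3$. The parameter $s \in [0, L]$ measures the arc length and $L$ is the total length of the curve.
The curve is determined by its vertices $C_i$ that are located at the positions $\mathbf{r}_i = (\mathbf{r}_0, \dots, \mathbf{r}_n)$ with $\mathbf{r}(s_i) = \mathbf{r}_i$. The
endpoints of the curve are at $\mathbf{r}(0) = \mathbf{r}_0$ and $\mathbf{r}(L) = \mathbf{r}_n$. The nearest neighbor vertices $C_i$ and $C_{i+1}$ are connected by
the line segments

$$\mathbf{r}(s) = \frac{s - s_i}{s_{i+1} - s_i}\,\mathbf{r}_{i+1} - \frac{s - s_{i+1}}{s_{i+1} - s_i}\,\mathbf{r}_i$$

where $s_i < s < s_{i+1}$. We utilize the Galilean invariance to translate the base of the curve to the origin in $\mathbb{R}^3$ so that

$$\mathbf{r}_0 = 0$$

The remaining global rotational orientation of the curve can then be fully determined by the choice of $\mathbf{r}_1$ and $\mathbf{r}_2$.
For each pair of nearest neighbor vertices $\mathbf{r}_{i+1}$ and $\mathbf{r}_i$ along the curve we introduce the unit tangent vector

$$\mathbf{t}_i = \frac{\mathbf{r}_{i+1} - \mathbf{r}_i}{|\mathbf{r}_{i+1} - \mathbf{r}_i|} \tag{14}$$

If all tangent vectors are known, the position of the $k^{\rm th}$ vertex is given by

$$\mathbf{r}_k = \sum_{i=0}^{k-1} |\mathbf{r}_{i+1} - \mathbf{r}_i| \cdot \mathbf{t}_i \tag{15}$$

We now introduce the discrete Frenet frame (DF frame) at the vertex $C_i$ at $\mathbf{r}_i$. This can be done whenever the
three vertices at $\mathbf{r}_{i+1}$, $\mathbf{r}_i$ and $\mathbf{r}_{i-1}$ are not located on a common line, so that $\mathbf{t}_i$ and $\mathbf{t}_{i-1}$ are not parallel. This enables
us to determine the unit binormal vector

$$\mathbf{b}_i = \frac{\mathbf{t}_{i-1} \times \mathbf{t}_i}{|\mathbf{t}_{i-1} \times \mathbf{t}_i|} \tag{16}$$

and the unit normal vector

$$\mathbf{n}_i = \mathbf{b}_i \times \mathbf{t}_i \tag{17}$$

The orthogonal triplet $(\mathbf{n}_i, \mathbf{b}_i, \mathbf{t}_i)$ constitutes the discrete Frenet frame (DF frame) for the curve at the position of the
vertex $\mathbf{r}_i$ for each $i = 1, \dots, n-1$, see Figure 4.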



FIG. 4: A discrete piecewise linear curve is defined by its vertices $C_i$ and at each vertex there is an orthonormal discrete Frenet
frame $(\mathbf{t}_i, \mathbf{n}_i, \mathbf{b}_i)$, provided $\mathbf{t}_{i-1}$ and $\mathbf{t}_i$ are not parallel.
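The definitions (14)-(17) translate directly into array operations. The following is a minimal sketch, assuming the vertices are given as rows of a NumPy array; the function names are hypothetical and the binormal is taken as the normalized cross product of consecutive tangents, as in (16).

```python
import numpy as np

def discrete_frenet_frames(r):
    """Tangents t_i for i = 0..n-1, and (n_i, b_i) for i = 1..n-1.

    r is an (n+1, 3) array of vertex positions. Entry j of the returned
    n and b arrays belongs to vertex i = j + 1. Where t_{i-1} and t_i are
    parallel the frame is undefined and the entries are NaN.
    """
    r = np.asarray(r, dtype=float)
    seg = r[1:] - r[:-1]
    t = seg / np.linalg.norm(seg, axis=1, keepdims=True)      # eq. (14)
    cross = np.cross(t[:-1], t[1:])                           # t_{i-1} x t_i
    with np.errstate(invalid="ignore", divide="ignore"):
        b = cross / np.linalg.norm(cross, axis=1, keepdims=True)   # eq. (16)
    n = np.cross(b, t[1:])                                    # eq. (17)
    return t, n, b

def rebuild_vertices(lengths, t):
    """Eq. (15): vertex positions from segment lengths and tangents, r_0 = 0."""
    steps = np.asarray(lengths)[:, None] * np.asarray(t)
    return np.vstack([np.zeros(3), np.cumsum(steps, axis=0)])

# Usage on a hypothetical sampled helix, translated so that r_0 = 0.
u = np.linspace(0.0, 4.0, 9)
r = np.stack([np.cos(u), np.sin(u), 0.5 * u], axis=1)
r = r - r[0]
t, n, b = discrete_frenet_frames(r)
```

Returning NaN at collinear triples, rather than raising, is a design choice of this sketch; it makes the inflection points discussed later easy to locate.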



B. The Transfer Matrix



We now proceed to derive a discretized version of the Frenet equation (DF equation) that relates the discrete Frenet
frame at vertex $C_i$ to the discrete Frenet frame at vertex $C_{i+1}$ and allows for the construction of the curve in terms
of the appropriate discrete versions of the curvature $\kappa(s)$ and torsion $\tau(s)$.

From general considerations [4] we conclude that the DF equation should involve a transfer matrix $\mathcal{R}_{i+1,i}$ that
maps the DF frame at the vertex $i$ to the DF frame at the vertex $i + 1$,

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_{i+1} = \mathcal{R}_{i+1,i}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_i \tag{18}$$

The construction of this transfer matrix then amounts to a solution of the DF equation:

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_n = \mathcal{R}_{n,n-1}\cdot\mathcal{R}_{n-1,n-2}\cdot\ \dots\ \cdot\mathcal{R}_{2,1}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_1$$

so that once the transfer matrix is known for all $i = 1, \dots, n-1$, we can use (18) to construct all the Frenet frames
$(\mathbf{n}_i, \mathbf{b}_i, \mathbf{t}_i)$ for $i = 2, \dots, n$ and the entire curve $\mathbf{r}(s)$ using (15) together with the fact that the curve is linear in the intervals
$s_{i-1} < s < s_i$. We recall that for the initial conditions we need to specify $\mathbf{r}_0$, which we have already chosen to coincide
with the origin $\mathbf{r}_0 = 0$, and $\mathbf{r}_1$ and $\mathbf{r}_2$, which remove the degeneracy under global SO(3) rotations of the curve in $\mathbb{R}^3$.

The transfer matrix $\mathcal{R}_{i+1,i}$ is an element of the adjoint representation of SO(3), thus we can parametrize it in terms
of Euler angles. We choose the (zxz) angles

$$\mathcal{R}_{i+1,i} = \begin{pmatrix} -\sin\psi\sin\phi + \cos\theta\cos\psi\cos\phi & \sin\theta\cos\psi & -\sin\psi\cos\phi - \cos\theta\cos\psi\sin\phi \\ -\sin\theta\cos\phi & \cos\theta & \sin\theta\sin\phi \\ \cos\psi\sin\phi + \cos\theta\sin\psi\cos\phi & \sin\theta\sin\psi & \cos\psi\cos\phi - \cos\theta\sin\psi\sin\phi \end{pmatrix}_{i+1,i} \tag{19}$$

Here the angular variables have the following ranges: For the inclination angle $\theta$ we take $\theta \in [0, \pi]$ mod$(2\pi)$ and for
the two azimuthal angles we choose $\phi \in [-\pi, \pi]$ mod$(2\pi)$ and $\psi \in [-\pi, \pi]$ mod$(2\pi)$. Note that since the angular
variables are elements of the transfer matrix that takes the discrete Frenet frame from the vertex $i$ to the vertex $i + 1$,
they are all to be interpreted as link variables that are defined on the bonds connecting the vertices.
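From (16) we get the following condition

$$\mathbf{b}_{i+1} \cdot \mathbf{t}_i = 0$$

Thus for each bond $(i, i+1)$

$$\sin\theta_{i+1,i}\,\sin\phi_{i+1,i} = 0$$

and we conclude from (14)-(17) that for all $i$ we must have

$$\phi_{i+1,i} = 0$$

This simplifies the discrete Frenet equation into

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_{i+1} = \begin{pmatrix}\cos\psi\cos\theta & \cos\psi\sin\theta & -\sin\psi\\ -\sin\theta & \cos\theta & 0\\ \sin\psi\cos\theta & \sin\psi\sin\theta & \cos\psi\end{pmatrix}_{i+1,i}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_i \equiv \mathcal{R}_{i+1,i}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_i \tag{20}$$

Here

$$\cos\psi_{i+1,i} = \mathbf{t}_{i+1}\cdot\mathbf{t}_i \tag{21}$$

is the discrete bond angle and

$$\cos\theta_{i+1,i} = \mathbf{b}_{i+1}\cdot\mathbf{b}_i \tag{22}$$

is the discrete torsion angle. Geometrically, the bond angle measures the angle between $\mathbf{t}_{i+1}$ and $\mathbf{t}_i$ around $\mathbf{b}_{i+1}$
on the plane that is determined by the three vertices $(C_i, C_{i+1}, C_{i+2})$ (Figure 5). The torsion angle measures the
angle between the two planes that are determined by the vertices $(C_{i-1}, C_i, C_{i+1})$ and $(C_i, C_{i+1}, C_{i+2})$, respectively
(Figure 5).

FIG. 5: The bond angle $\psi_{i+1,i}$ is determined by the three vertices $(C_i, C_{i+1}, C_{i+2})$. The torsion angle $\theta_{i+1,i}$ is the angle
between the two planes determined by vertices $(C_{i-1}, C_i, C_{i+1})$ and $(C_i, C_{i+1}, C_{i+2})$.

We give these planes an orientation in $\mathbb{R}^3$ by extending the range of the torsion angle from $\theta_{i+1,i} \in [0, \pi]$
into $\theta_{i+1,i} \in [-\pi, \pi]$ mod$(2\pi)$. This introduces a discrete $Z_2$ symmetry

$$Z_2: \quad \theta_{i+1,i} \to -\theta_{i+1,i} \tag{23}$$

that we find useful in the sequel.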
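To make the link variables concrete, the sketch below computes the bond angles from (21) and the signed torsion angles from the DF frames, and then checks that the matrix in (20) indeed carries each frame to the next. The function names are hypothetical. The sign of the torsion angle is fixed here by reading off $\sin\theta_{i+1,i} = -\mathbf{b}_{i+1}\cdot\mathbf{n}_i$ from the second row of (20); this is an assumption about how the extended range $[-\pi, \pi]$ is meant to be resolved.

```python
import numpy as np

def df_frames(r):
    """Rows (n_i, b_i, t_i) of the discrete Frenet frame at vertices i = 1..n-1."""
    seg = np.diff(np.asarray(r, dtype=float), axis=0)
    t = seg / np.linalg.norm(seg, axis=1, keepdims=True)
    cross = np.cross(t[:-1], t[1:])
    b = cross / np.linalg.norm(cross, axis=1, keepdims=True)
    n = np.cross(b, t[1:])
    return np.stack([n, b, t[1:]], axis=1)          # shape (n-1, 3, 3)

def bond_torsion(frames):
    """Bond angles psi_{i+1,i} in [0, pi] and torsion angles theta_{i+1,i} in (-pi, pi]."""
    n, b, t = frames[:, 0], frames[:, 1], frames[:, 2]
    cos_psi = np.einsum("ij,ij->i", t[1:], t[:-1])           # eq. (21)
    psi = np.arccos(np.clip(cos_psi, -1.0, 1.0))
    cos_theta = np.einsum("ij,ij->i", b[1:], b[:-1])         # eq. (22)
    sin_theta = -np.einsum("ij,ij->i", b[1:], n[:-1])        # second row of (20)
    return psi, np.arctan2(sin_theta, cos_theta)

def transfer_matrix(psi, theta):
    """Explicit matrix of eq. (20)."""
    cp, sp, ct, st = np.cos(psi), np.sin(psi), np.cos(theta), np.sin(theta)
    return np.array([[cp * ct, cp * st, -sp],
                     [-st, ct, 0.0],
                     [sp * ct, sp * st, cp]])

# Usage on a hypothetical random polygonal curve.
rng = np.random.default_rng(0)
r = np.cumsum(rng.normal(size=(8, 3)), axis=0)
frames = df_frames(r)
psi, theta = bond_torsion(frames)
```

Using `arctan2` rather than `arccos` for the torsion angle keeps the orientation information that the $Z_2$ extension (23) is introduced to capture.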
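We recall the Rodrigues formula

$$e^{\alpha U} = \mathbb{I} + U\sin\alpha + U^2\,(1 - \cos\alpha) \tag{24}$$

where

$$U = \mathbf{u}\cdot\mathbf{T} = u^i T^i$$

and $T^i$ are the SO(3) matrices (6) and $\mathbf{u}$ is a unit vector. With these we can write the transfer matrix as follows,

$$\mathcal{R}_{i+1,i} = \exp\{-\psi_{i+1,i}T^2\}\cdot\exp\{-\theta_{i+1,i}T^3\} = \exp\{-\alpha\,\mathbf{v}\cdot\mathbf{T}\}_{i+1,i} \tag{25}$$

where

$$\alpha_{i+1,i} = 2\arccos\left[\frac{1}{2}\sqrt{1 + \mathbf{t}_{i+1}\cdot\mathbf{t}_i + \mathbf{b}_{i+1}\cdot\mathbf{b}_i + (\mathbf{b}_{i+1}\cdot\mathbf{b}_i)(\mathbf{t}_{i+1}\cdot\mathbf{t}_i)}\right]$$

and

$$\mathbf{v}_{i+1,i} = \frac{1}{\sin\frac{\alpha}{2}}\left(-\sin\frac{\psi}{2}\sin\frac{\theta}{2},\ \sin\frac{\psi}{2}\cos\frac{\theta}{2},\ \cos\frac{\psi}{2}\sin\frac{\theta}{2}\right)$$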
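The factorized and the single-exponential forms of the transfer matrix can be compared numerically. The sketch below is a consistency check under stated assumptions: the generators are taken as the standard adjoint basis with $[T^a, T^b] = \epsilon^{abc}T^c$, the angle is written as $\alpha = 2\arccos[\cos(\psi/2)\cos(\theta/2)]$, and the axis is taken with a negative first component, which is what the composition of the two rotations gives under this convention. Function names and test angles are illustrative.

```python
import numpy as np

# Adjoint basis of the so(3) Lie algebra (assumed convention, [T1, T2] = T3).
T1 = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
T2 = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
T3 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

def rodrigues(alpha, u):
    """exp(alpha * u.T) for a unit vector u, via the Rodrigues formula."""
    U = u[0] * T1 + u[1] * T2 + u[2] * T3
    return np.eye(3) + U * np.sin(alpha) + U @ U * (1.0 - np.cos(alpha))

psi, theta = 0.7, -1.1   # hypothetical bond and torsion angles

# Factorized form: exp(-psi T2) exp(-theta T3).
R_product = rodrigues(-psi, np.array([0.0, 1.0, 0.0])) @ rodrigues(-theta, np.array([0.0, 0.0, 1.0]))

# Explicit matrix of the simplified discrete Frenet equation.
cp, sp, ct, st = np.cos(psi), np.sin(psi), np.cos(theta), np.sin(theta)
R_explicit = np.array([[cp * ct, cp * st, -sp], [-st, ct, 0.0], [sp * ct, sp * st, cp]])

# Single rotation by alpha about the axis v.
alpha = 2.0 * np.arccos(np.cos(psi / 2) * np.cos(theta / 2))
v = np.array([-np.sin(psi / 2) * np.sin(theta / 2),
              np.sin(psi / 2) * np.cos(theta / 2),
              np.cos(psi / 2) * np.sin(theta / 2)]) / np.sin(alpha / 2)
R_axis = rodrigues(-alpha, v)
```

All three matrices agree to machine precision for generic angles; the axis becomes ill-defined only when both angles vanish, where the transfer matrix is the identity.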
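C. Gauge symmetries

Let us consider the effect of the discrete version of the local SO(2) rotation (4),

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_i \to e^{\Delta_i T^3}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_i = \begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix}_i \tag{26}$$

For the covariance of the DF equation under (26) we need

$$\mathcal{R}_{i+1,i} \to e^{\Delta_{i+1}T^3}\,\mathcal{R}_{i+1,i}\,e^{-\Delta_i T^3}$$

A direct computation shows that this implies the following transformation laws

$$\theta_{i+1,i}\,T^3 \to (\theta_{i+1,i} + \Delta_i - \Delta_{i+1})\,T^3 \tag{29}$$

$$\psi_{i+1,i}\,T^2 \to \psi_{i+1,i}\,(T^2\cos\Delta_{i+1} - T^1\sin\Delta_{i+1}) \tag{30}$$

These are the discrete versions of the transformations of $\tau$ and $\kappa$ in (7), (8) respectively.
Explicitly, the gauge transformed transfer matrix is

$$\mathcal{R}^{\Delta}_{i+1,i} = e^{\Delta_{i+1}T^3}\,\mathcal{R}_{i+1,i}\,e^{-\Delta_i T^3} \tag{31}$$

$$= \begin{pmatrix}\cos\Delta\cos\theta_\Delta\cos\psi + \sin\Delta\sin\theta_\Delta & \cos\Delta\sin\theta_\Delta\cos\psi - \sin\Delta\cos\theta_\Delta & -\cos\Delta\sin\psi\\ \sin\Delta\cos\theta_\Delta\cos\psi - \cos\Delta\sin\theta_\Delta & \sin\Delta\sin\theta_\Delta\cos\psi + \cos\Delta\cos\theta_\Delta & -\sin\Delta\sin\psi\\ \cos\theta_\Delta\sin\psi & \sin\theta_\Delta\sin\psi & \cos\psi\end{pmatrix} \tag{32}$$

We have here used the notation

$$\Delta = \Delta_{i+1}, \qquad \theta_\Delta = \theta_{i+1,i} + \Delta_i \tag{33}$$

and the corresponding general frame Frenet equation is

$$\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix}_{i+1} = \mathcal{R}^{\Delta}_{i+1,i}\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix}_i \tag{34}$$

Notice that even though the explicit matrix elements in (32) do not have a manifestly covariant form in terms of the
link variables, the gauge transformed transfer matrix (31) is by construction a covariant link variable.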
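The covariance statement can be verified numerically by comparing the explicit gauge transformed matrix with the conjugated product. In the sketch below the function names are hypothetical, and the identification $\Delta = \Delta_{i+1}$, $\theta_\Delta = \theta_{i+1,i} + \Delta_i$ is the assumption being tested.

```python
import numpy as np

def exp_T3(a):
    """exp(a T^3): rotation by the angle a in the (n, b) plane."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def transfer_matrix(psi, theta):
    """Transfer matrix in the Frenet gauge."""
    cp, sp, ct, st = np.cos(psi), np.sin(psi), np.cos(theta), np.sin(theta)
    return np.array([[cp * ct, cp * st, -sp], [-st, ct, 0.0], [sp * ct, sp * st, cp]])

def gauged_transfer_matrix(psi, theta, delta_i, delta_next):
    """Explicit gauge transformed matrix with Delta = delta_next and
    theta_Delta = theta + delta_i."""
    cD, sD = np.cos(delta_next), np.sin(delta_next)
    ct, st = np.cos(theta + delta_i), np.sin(theta + delta_i)
    cp, sp = np.cos(psi), np.sin(psi)
    return np.array([[cD * ct * cp + sD * st, cD * st * cp - sD * ct, -cD * sp],
                     [sD * ct * cp - cD * st, sD * st * cp + cD * ct, -sD * sp],
                     [ct * sp, st * sp, cp]])

psi, theta, delta_i, delta_next = 0.9, -0.4, 0.3, 1.7   # hypothetical values
lhs = gauged_transfer_matrix(psi, theta, delta_i, delta_next)
rhs = exp_T3(delta_next) @ transfer_matrix(psi, theta) @ exp_T3(-delta_i)
print(np.allclose(lhs, rhs))
```

Setting both gauge parameters to zero reduces the explicit matrix to the Frenet gauge transfer matrix, which is a quick sanity check on the notation.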



D. Continuum Limit



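The different choices of $\Delta_i$ in (34) correspond to different generalized Frenet frames. We shall now verify that with
the general version of transfer matrix (32), this indeed yields the generalized Frenet equation (5) in the continuum
limit where the distances between the vertices $C_i$ of the curve vanish, provided the limit is a class $C^3$ curve.

We define

$$\begin{aligned}\Delta_{i+1} - \Delta_i &= \epsilon\cdot\sigma_{i+1,i}\\ \tfrac{1}{2}(\Delta_{i+1} + \Delta_i) &= \eta_{i+1,i}\\ \psi_{i+1,i} &= \epsilon\cdot\kappa_{i+1,i}\\ \theta_{i+1,i} &= \epsilon\cdot\tau_{i+1,i}\end{aligned} \tag{35}$$

where $\sigma_{i+1,i}$, $\kappa_{i+1,i}$ and $\tau_{i+1,i}$ are some finite constants. When we expand (34) in $\epsilon$ we get in the leading order

$$\frac{1}{\epsilon}\left[\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix}_{i+1} - \begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix}_i\right] = \begin{pmatrix}0 & (\tau - \sigma) & -\kappa\cos\eta\\ -(\tau - \sigma) & 0 & -\kappa\sin\eta\\ \kappa\cos\eta & \kappa\sin\eta & 0\end{pmatrix}_{i+1,i}\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{t}\end{pmatrix}_i \tag{36}$$

If the $\epsilon \to 0$ limit exists it gives us the generalized continuum Frenet equation (5), with the identification

$$\sigma_{i+1,i} \to \dot\eta(s)$$

and the identification (35) between the discrete torsion and curvature angles with their continuum counterparts.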
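The leading order expansion can be checked numerically by evaluating the gauge transformed transfer matrix at a small value of $\epsilon$. The sketch below assumes the scalings $\psi = \epsilon\kappa$, $\theta = \epsilon\tau$, $\Delta_{i+1} - \Delta_i = \epsilon\sigma$ and $\frac{1}{2}(\Delta_{i+1} + \Delta_i) = \eta$; the numerical constants and the function name are illustrative.

```python
import numpy as np

def gauged_transfer_matrix(psi, theta, delta_i, delta_next):
    """Explicit gauge transformed transfer matrix, Delta = delta_next,
    theta_Delta = theta + delta_i."""
    cD, sD = np.cos(delta_next), np.sin(delta_next)
    ct, st = np.cos(theta + delta_i), np.sin(theta + delta_i)
    cp, sp = np.cos(psi), np.sin(psi)
    return np.array([[cD * ct * cp + sD * st, cD * st * cp - sD * ct, -cD * sp],
                     [sD * ct * cp - cD * st, sD * st * cp + cD * ct, -sD * sp],
                     [ct * sp, st * sp, cp]])

eps = 1e-5
kappa, tau, sigma, eta = 0.8, 0.4, 0.3, 1.1   # hypothetical finite constants

delta_i = eta - 0.5 * eps * sigma
delta_next = eta + 0.5 * eps * sigma
R = gauged_transfer_matrix(eps * kappa, eps * tau, delta_i, delta_next)

# Finite-difference generator (R - 1) / eps versus the continuum matrix.
generator = (R - np.eye(3)) / eps
expected = np.array([[0.0, tau - sigma, -kappa * np.cos(eta)],
                     [-(tau - sigma), 0.0, -kappa * np.sin(eta)],
                     [kappa * np.cos(eta), kappa * np.sin(eta), 0.0]])
print(np.max(np.abs(generator - expected)))
```

The residual is of order $\epsilon$, consistent with the statement that the continuum equation is recovered only in the leading order.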



E. Inflection points

Consider a piecewise linear curve that has a single isolated inflection point located at vertex $C_i$. A generalization to
several inflection points and straight segments is straightforward. By assumption, the preceding vertex $C_{i-1}$ admits
a Frenet frame. Since the tangent vectors $\mathbf{t}_i$ and $\mathbf{t}_{i-1}$ are parallel, at the vertex $C_i$ both the normal vector and
the binormal vector of a Frenet frame can not be determined and the Frenet frame at $C_i$ can not be introduced.
Consequently the torsion angle $\theta_{i,i-1}$ can not be defined. But the definition of the bond angle $\psi_{i,i-1}$ involves only the
tangent vectors so it can still be computed and from (21) we get

$$\psi_{i,i-1} = 0 \pmod{2\pi}$$

In order to introduce a framing of the curve that covers the vertex $C_i$ we proceed as follows: We first deform the
curve slightly by moving the vertex $C_i$ in a direction of some arbitrarily chosen vector $\mathbf{u}$ that is not parallel with $\mathbf{t}_i$,

$$\mathbf{r}_i \to \mathbf{r}_i + \epsilon\cdot\mathbf{u} \tag{37}$$

Here the limit $\epsilon \to 0$ is tacitly understood. The introduction of $\mathbf{u}$ removes the inflection point from the shifted vertex
$C_i$ and this enables us to introduce a $\mathbf{u}$ dependent Frenet frame at the shifted vertex $C_i$. In the limit where $\epsilon$ vanishes
we get a $\mathbf{u}$ dependent frame at the original vertex $C_i$, obtained by transferring the Frenet frame from the vertex $C_{i-1}$
as follows,

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_i = \begin{pmatrix}\cos\theta_{i,i-1} & \sin\theta_{i,i-1} & 0\\ -\sin\theta_{i,i-1} & \cos\theta_{i,i-1} & 0\\ 0 & 0 & 1\end{pmatrix}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\end{pmatrix}_{i-1} \tag{38}$$

Here $\theta_{i,i-1}$ is now some, in general explicitly $\mathbf{u}$ dependent, angle.

In order to establish that the frame can be chosen in a $\mathbf{u}$ independent manner we proceed to remove the explicit $\mathbf{u}$
dependence. For this we introduce the gauge transformation (29) in (38) which sends

$$\theta_{i,i-1} \to \theta_{i,i-1} + \Delta_{i-1} - \Delta_i$$

Since we have the original Frenet frame at the vertex $C_{i-1}$, we also have

$$\Delta_{i-1} = 0$$

But $\Delta_i$ is freely at our disposal and we may choose it so that any $\mathbf{u}$ dependence becomes removed. This leaves us
with a $\mathbf{u}$ independent remainder that we may choose at our convenience,

$$\theta_{i,i-1} - \Delta_i \equiv \Delta_{i,i-1}$$

where $\Delta_{i,i-1}$ is now by construction a $\mathbf{u}$ independent quantity, at our disposal. Different choices correspond to
different gauges.

Since $\mathbf{t}_i$ and $\mathbf{t}_{i+1}$ are not parallel, we can proceed to construct a frame at vertex $C_{i+1}$ from the frame $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{t})_i$ at
vertex $C_i$ using the transfer matrix (32). Since the remaining gauge parameters $\Delta_k$ with $k > i$ are all at our disposal,
we may return to the Frenet frame, or select any other convenient framing, at the vertex $C_{i+1}$ and at all subsequent
vertices. If the goal is to approximate a continuous space curve, in the limit of vanishing bond length the gauge
parameters $\Delta_k$ should be selected in such a manner that in the continuum limit they yield the gauge function $\eta(s)$
and so that the ensuing discrete transfer matrix smoothly goes over to its continuum limit (36).



F. Discrete gauge transformations



The transfer matrix $\mathcal{R}_{i+1,i}$ determines the curve in $\mathbb{R}^3$ up to rigid Galilean motions, i.e. global translations and
spatial rotations. The improper spatial rotation group O(3) acts on each of the vertices in (15) by a rotation
matrix $\mathcal{O} \in$ O(3) that sends each of the $\mathbf{r}_k$ into

$$\mathbf{r}_k \to \mathcal{O}\,\mathbf{r}_k$$

As a consequence only the global orientation of the curve in $\mathbb{R}^3$ changes. An example is the improper rotation that
inverts the curve in $\mathbb{R}^3$ by reversing the direction of each tangent vector

$$\mathbf{t}_i \to -\mathbf{t}_i$$

but with no effect on the $\mathbf{n}_i$ and $\mathbf{b}_i$. From the explicit form of the transfer matrix in (20) we conclude that this
corresponds to the following global version of (29), (30),

$$\psi_i \to -\psi_i$$

That is, $\Delta_i = \pi$ for all $i$. Consequently if we include this improper rotation in our gauge structure we can restrict the
range of $\psi_i$ from $\psi_i \in [-\pi, \pi]$ mod$(2\pi)$ to $\psi_i \in [0, \pi]$ mod$(2\pi)$, but we prefer to continue with the extended range.
Similarly, we can introduce the improper rotation that sends

$$\mathbf{b}_i \to -\mathbf{b}_i$$

with no effect on $\mathbf{t}_i$ and $\mathbf{n}_i$. Since the $\mathbf{t}_i$ remain intact the curve does not change, and from the DF equation (20) we
conclude that this corresponds to the following global $Z_2$ transformation

$$\theta_i \to -\theta_i$$

This is the $Z_2$ symmetry that we have introduced in (23), to extend the range of $\theta_i$ from $\theta_i \in [0, \pi]$ to
$\theta_i \in [-\pi, \pi]$ mod$(2\pi)$. We note that this symmetry of the underlying curve can not be reproduced by the gauge
transformation (29), (30); nevertheless the curve remains intact since the $\mathbf{t}_i$ do not change.

Another useful discrete transformation in our subsequent discrete curve analysis is the proper rotation that at a
given vertex $C_i$ sends

$$\begin{pmatrix}\mathbf{n}_i\\ \mathbf{b}_i\end{pmatrix} \to \begin{pmatrix}-\mathbf{n}_i\\ -\mathbf{b}_i\end{pmatrix}$$

but with no effect on $\mathbf{t}_i$ so that the curve remains intact. This rotation is obtained by selecting $\Delta_{i+1} = \pi$ and with
all $\Delta_k = 0$ at the preceding vertices $C_k$ (with $k < i$). Since the $\Delta_{i+1}$ appears in the gauge transformation law of both
$\theta_{i+1,i}$ and $\theta_{i+2,i+1}$, this leads to the following realization of the gauge transformation (29), (30)

$$\begin{aligned}\theta_{i+1,i} &\to \theta_{i+1,i} - \pi\\ \theta_{i+2,i+1} &\to \theta_{i+2,i+1} + \pi\\ \psi_{i+1,i} &\to -\psi_{i+1,i}\end{aligned}$$

where the last line follows from (30) with $\Delta_{i+1} = \pi$. If we generalize this gauge transformation by selecting

$$\Delta_k = \pi \quad \text{for } k \geq i + 1$$

with

$$\Delta_k = 0 \quad \text{for } k < i + 1$$

where the vertex $C_i$ is preselected, the gauge transformation becomes

$$\begin{aligned}\theta_{i+1,i} &\to \theta_{i+1,i} - \pi\\ \psi_{k+1,k} &\to -\psi_{k+1,k} \quad \text{for all } k \geq i\end{aligned} \tag{39}$$

Since the bond angle is the discrete version of the Frenet curvature (35), we recognize here the discrete analog of
the continuum gauge transformation (11), (12). For a piecewise linear discretization of a plane curve such as the one
in Figure 1, this enables us to introduce a framing that captures the kink-soliton behaviour (1), (13) of the inflection
point, with the change of sign in curvature at the soliton position (Figure 6).
FIG. 6: A continuous plane curve with an inflection point (yellow dot) such as the one in Figure 1, together with its discrete
approximation. The tangent vectors $\mathbf{t}_i$ (red) of the discrete approximation can be chosen so that two neighbors are never
parallel and thus a discrete Frenet frame can be introduced at each vertex. When we pass through the inflection point the
direction of the binormal vectors following (A, B, C) becomes reflected in the plane into (D, E, F) and there is a discontinuity
in the Frenet framing. But if we introduce the gauge transformation (39) at vertices after the inflection point, the ensuing
framing (A, B, C, G, H, I) is continuous.



G. Curve Construction



An example of problems where the present formalism can be applied is the construction of a discrete and piecewise
linear curve from the known values of its bond and torsion angles. These angles can be constructed for example using
an energy principle to locate a minimum energy configuration of some energy functional of the bond and torsion angles.
We may define the angles using the Frenet frame. Examples of energy functionals have been discussed in [7], [5].

Three vertices are needed to specify the position and the overall rotational orientation of the curve. To compute
a single bond angle from the curve, we need three vertices while for the torsion angle we need four; see Figure 5.
Consequently from the first three initial positions of the curve, $(\mathbf{r}_0, \mathbf{r}_1, \mathbf{r}_2)$, we can compute the first bond angle $\psi_{1,0}$.
But in order to compute the first pair $(\psi_{2,1}, \theta_{2,1})$ we also need to specify $\mathbf{r}_3$.

Here we are interested in the inverse problem where the set of angles $\{\psi_{k+1,k}, \theta_{k+1,k}\}$ are assumed to be known.
Depending on the boundary conditions for the energy functional, the known initial data may also include numerical
values of $(\psi_{1,0}, \theta_{1,0})$, even though $\theta_{1,0}$ lacks a geometric interpretation. In such a case we can immediately proceed
to the computation of the entire curve using (20) or alternatively using the transfer matrix (34), starting from an
initial choice of frame $(\mathbf{n}_0, \mathbf{b}_0, \mathbf{t}_0)$. Different initial choices are related to each other by a global, i.e. index $i$ independent,
parameter $\Delta$ in (29), (30).

We get both the frame at the vertex $k$ and its location $\mathbf{r}_k$ when we also employ (15), starting from a given initial
value $\mathbf{r}_0 (= 0)$.
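The inverse problem described above can be sketched in a few lines: starting from an initial frame and the origin, each step moves along the current tangent and then rotates the frame with the Frenet gauge transfer matrix. The function names, the choice of the identity as initial frame, and the sample angles and segment lengths are illustrative assumptions.

```python
import numpy as np

def transfer_matrix(psi, theta):
    """Transfer matrix of the discrete Frenet equation in the Frenet gauge."""
    cp, sp, ct, st = np.cos(psi), np.sin(psi), np.cos(theta), np.sin(theta)
    return np.array([[cp * ct, cp * st, -sp], [-st, ct, 0.0], [sp * ct, sp * st, cp]])

def build_curve(psi, theta, delta, frame0=None):
    """Vertices r_0..r_m from link angles and segment lengths.

    psi[k], theta[k] take the frame at vertex k to the frame at vertex k + 1,
    and delta[k] = |r_{k+1} - r_k|, so len(delta) == len(psi) + 1.
    The rows of a frame are (n, b, t); frame0 defaults to the identity.
    """
    frame = np.eye(3) if frame0 is None else np.asarray(frame0, dtype=float)
    r = [np.zeros(3)]
    for k in range(len(delta)):
        r.append(r[-1] + delta[k] * frame[2])                    # step along t_k
        if k < len(psi):
            frame = transfer_matrix(psi[k], theta[k]) @ frame    # frame at vertex k + 1
    return np.array(r)

# Hypothetical input data; theta[0] plays the role of the angle without
# geometric interpretation, since the initial frame is arbitrary.
psi = np.array([0.6, 1.0, 0.8, 1.2])
theta = np.array([0.3, -1.1, 2.0, 0.7])
delta = np.array([1.0, 1.2, 0.9, 1.1, 1.0])
r = build_curve(psi, theta, delta)
```

A different initial frame rotates the whole curve rigidly, which matches the remark that the initial choices differ only by global transformations.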

In general we expect to have a situation where the three first points $(\mathbf{r}_0, \mathbf{r}_1, \mathbf{r}_2)$ are given. From these points we
get the two tangent vectors $\mathbf{t}_0$ and $\mathbf{t}_1$. We then use (16), (17) to complete the Frenet frame at the location $\mathbf{r}_1$. We
identify the bond angle $\psi_{1,0}$ with the angle between the two vectors $\mathbf{t}_0$ and $\mathbf{t}_1$ using (21). This bond angle may or
may not be determined by the energy functional. If it is determined, the angle between $\mathbf{t}_0$ and $\mathbf{t}_1$ is determined and
instead of fully specifying $\mathbf{r}_2$ we only need to specify its distance from $\mathbf{r}_1$ and the remaining directional angle that we
may call $\theta_{1,0}$.
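For a practical algorithmic implementation the following choice can be convenient,

[Equation (40): explicit initial values for the vertex position and for the vectors $\mathbf{n}_1$, $\mathbf{b}_1$, expressed in terms of $\delta_{1,0}$, $\cos\psi_{1,0}$ and $\sin\psi_{1,0}$; the matrix layout is not recoverable from the source.]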

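where we have introduced the notation

$$\delta_{k+1,k} = |\mathbf{r}_{k+1} - \mathbf{r}_k|$$

for the segment lengths. The generalized Frenet frame together with the corresponding location of the vertex $\mathbf{r}_{i+1}$
can then be computed by iterative application of

$$\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\\ \mathbf{r}\end{pmatrix}_{i+1} = \begin{pmatrix}\mathcal{R}_{i+1,i} & \mathbf{0}\\ (0\ \ 0\ \ \delta_{i+1,i}) & 1\end{pmatrix}\begin{pmatrix}\mathbf{n}\\ \mathbf{b}\\ \mathbf{t}\\ \mathbf{r}\end{pmatrix}_i \tag{41}$$

This can be directly generalized into

$$\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{e}_3\\ \mathbf{r}\end{pmatrix}_{i+1} = \begin{pmatrix}\mathcal{R}^{\Delta}_{i+1,i} & \mathbf{0}\\ (S^1\ \ S^2\ \ S^3) & 1\end{pmatrix}\begin{pmatrix}\mathbf{e}_1\\ \mathbf{e}_2\\ \mathbf{e}_3\\ \mathbf{r}\end{pmatrix}_i$$

where $\mathcal{R}^{\Delta}$ is the matrix (32) and the $S^1$, $S^2$, $S^3$ are the components of the vector

$$\mathbf{S}_{k+1,k} = \delta_{k+1,k}\begin{pmatrix}\cos\alpha\sin\beta\\ \sin\alpha\sin\beta\\ \cos\beta\end{pmatrix}$$

When $\beta = 0$ (and $\Delta = 0$) we obtain the transfer matrix (41) with the tangent vector of the curve, while for general
$(\alpha, \beta)$ the tangent of the curve is in the direction of $\mathbf{S}$ in the $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ frame. Thus this transfer matrix provides a
rule for transporting an a priori arbitrarily oriented orthogonal frame along the curve.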

Of particular interest is the construction of a discrete version of Bishop's parallel transport frame [3], as a gauge transformed version of the discrete Frenet frame. Since the Frenet frame starts with $(\psi_{2,1}, \theta_{2,1})$ and can be constructed once $(\mathbf{r}_0, \mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3)$ are known (unless we introduce $\theta_{1,0}$, which lacks a geometric interpretation), we assume this to be the case. The discrete version of Bishop's frame is obtained by gauge transformation from the Frenet frame, by demanding that



$$\theta_{2,1} \to \theta_{2,1} + \Delta_1 - \Delta_2 = 0$$

We can freely choose

$$\Delta_1 = 0$$

as an initial condition, and consequently we arrive at Bishop's frame by selecting

$$\Delta_2 = \theta_{2,1}$$

For $\Delta_3$ we get similarly from

$$\theta_{3,2} \to \theta_{3,2} + \Delta_2 - \Delta_3 = 0$$

that

$$\Delta_3 = \theta_{2,1} + \theta_{3,2}$$

and thus the discrete version of Bishop's parallel transport frame is related to the discrete Frenet frame by gauge transformations

$$\Delta_k = \sum_{i=1}^{k-1} \theta_{i+1,i}$$
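
The recursion above amounts to a running sum of the Frenet torsion angles. A minimal Python sketch, with the hypothetical helper names `bishop_gauge_parameters` and `residual_torsions`, and assuming the torsion angles are supplied as a list starting from $\theta_{2,1}$:

```python
from itertools import accumulate


def bishop_gauge_parameters(torsions):
    """Return [Delta_1, Delta_2, ...] for torsions [theta_{2,1}, theta_{3,2}, ...].

    Delta_1 = 0 is the free initial condition; every later Delta_k is the
    sum of the first k-1 torsion angles.
    """
    return [0.0] + list(accumulate(torsions))


def residual_torsions(torsions, deltas):
    """Gauge transformed torsion theta_{k+1,k} + Delta_k - Delta_{k+1}."""
    return [th + deltas[k] - deltas[k + 1] for k, th in enumerate(torsions)]


deltas = bishop_gauge_parameters([0.5, 0.25])
residuals = residual_torsions([0.5, 0.25], deltas)
```

With this choice every gauge transformed torsion angle vanishes, which is the defining property of the parallel transport frame used in the text.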



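When we substitute this in (32) with (33), we find that the transfer matrix (32) simplifies into

$$
\mathcal{R} =
\begin{pmatrix}
1 + \cos^2\theta_a (\cos\psi - 1) & \sin\theta_a \cos\theta_a (\cos\psi - 1) & -\cos\theta_a \sin\psi \\
\sin\theta_a \cos\theta_a (\cos\psi - 1) & 1 + \sin^2\theta_a (\cos\psi - 1) & -\sin\theta_a \sin\psi \\
\cos\theta_a \sin\psi & \sin\theta_a \sin\psi & \cos\psi
\end{pmatrix}
\qquad (42)
$$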



where now 



6/ 



and with (34), we can construct the discrete version of Bishop's parallel transport frame at each vertex $\mathbf{r}_k$.



IV. IV: FRAMING OF FOLDED PROTEINS 



As an application we utilize the DF equation to investigate the framing of the folded proteins in the Protein Data Bank (PDB) [10]. We are particularly interested in the existence and characterization of a preferred framing that derives from and directly reflects the physical properties of the folded proteins. The identification of such a preferred framing, if it exists, should help to pinpoint the physical principles that determine how proteins fold.

From the PDB we get the three dimensional coordinates of all the different atoms in a folded protein. The overall fold geometry is described by the location of the central $C_\alpha$ carbons that determine the protein backbone. We take the $C_\alpha$ carbons to be the vertices in a discrete and piecewise linear curve that models the backbone. We then use the $C_\alpha$ coordinates to compute the corresponding Frenet framing. For this we first apply (14), (16), (17) to obtain the orthonormal basis vectors at each vertex. We then construct the transfer matrices by evaluating the bond and torsion angles from (21) and (22).
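
The computation just described can be sketched as follows. Equations (14), (16), (17), (21), (22) are not reproduced in this section, so the formulas in the code are assumptions based on the usual discrete definitions: $\mathbf{t}_k \propto \mathbf{r}_{k+1} - \mathbf{r}_k$, $\mathbf{b}_k \propto \mathbf{t}_{k-1} \times \mathbf{t}_k$, $\mathbf{n}_k = \mathbf{b}_k \times \mathbf{t}_k$, the bond angle as the angle between consecutive tangents, and the torsion angle as the signed angle between consecutive binormals. The sign convention of the torsion angle and all function names are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(dot(a, a))
    return (a[0] / norm, a[1] / norm, a[2] / norm)


def frenet_frames(vertices):
    """Map k -> (n_k, b_k, t_k) for vertices k = 1 .. len(vertices) - 2."""
    t = [unit(sub(vertices[k + 1], vertices[k]))
         for k in range(len(vertices) - 1)]
    frames = {}
    for k in range(1, len(t)):
        # Assumed binormal: normalized t_{k-1} x t_k. It is undefined
        # (ZeroDivisionError) when consecutive segments are collinear.
        b = unit(cross(t[k - 1], t[k]))
        n = cross(b, t[k])
        frames[k] = (n, b, t[k])
    return frames


def bond_and_torsion(frames):
    """Map k -> (psi_{k+1,k}, theta_{k+1,k}) for consecutive frames."""
    angles = {}
    for k in frames:
        if k + 1 not in frames:
            continue
        _, b0, t0 = frames[k]
        _, b1, t1 = frames[k + 1]
        psi = math.acos(max(-1.0, min(1.0, dot(t1, t0))))
        # Both binormals are orthogonal to t_k, so b_k x b_{k+1} is
        # parallel to t_k; atan2 gives the signed rotation angle about t_k.
        theta = math.atan2(dot(cross(b0, b1), t0), dot(b0, b1))
        angles[k] = (psi, theta)
    return angles


# Three mutually orthogonal unit steps along x, y and z.
frames = frenet_frames([(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)])
psi, theta = bond_and_torsion(frames)[1]
```

For a protein, `vertices` would be the list of $C_\alpha$ coordinates read from a PDB entry; straight segments, where the binormal is undefined, need the separate treatment of inflection points discussed in the text.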



A. A: Z2 Frenet framing and solitons 



We start by analyzing in detail an explicit example, the chicken villin headpiece subdomain HP35 (PDB code 1YRF). This is a naturally existing 35-residue protein, with three $\alpha$-helices separated from each other by two loops. This protein continues to be the subject of very extensive studies both experimentally [11]-[14] and theoretically [15]-[19]. We note that the overall resolution in the experimental x-ray data in PDB is 1.07 Å in RMSD [13].

We first compute the backbone Frenet frame bond and torsion angles $(\psi_{i+1,i}, \theta_{i+1,i})$ from the PDB coordinates of the HP35 $C_\alpha$ carbons. The result is shown in Figure 7 (left).

We inquire whether the loop regions contain inflection points. As we have previously explained, for example in connection with Figure 6, the inflection points can be difficult to identify in terms of the bond angles of the discrete Frenet framing alone. But as is apparent from Figure 6, we can expect that an inflection point is located in the vicinity of vertices where the Frenet frame torsion angle is subject to strong local fluctuations. Thus we proceed to inspect the data in Figure 7 (left) using the gauge transformation (39) to scrutinize the loop regions where the Frenet torsion angle in Figure 7 (left) is strongly fluctuating. This leads us to a particular version of the $\mathbb{Z}_2$ Frenet frame, with bond and torsion angles as in Figure 7 (right).

FIG. 7: Left: The Frenet frame bond angle (blue) and torsion angle (red) along the HP35 backbone. In this frame the potential presence of an inflection point is only visible in large local variations of the torsion angle. Right: The outcome of $\mathbb{Z}_2$ gauge transformations (39) at the loop regions. The result clearly reveals the presence of inflection points; they are located between the sites where the (gauge transformed) bond angle changes its sign. This can also be used to identify the center of the loop. Note how closely the profile of the bond angle in the right hand side picture resembles that of the kink-soliton in the r.h.s. of Figure 2.

By comparing the bond angles in Figure 7 (right) with the kink-soliton profile in the right hand side of Figure 2 we observe that the bond angles of our gauge transformed frames at each of the loops have assumed the distinctive hallmark profile of a (discrete) kink-soliton that interpolates between the adjacent $\alpha$-helices. In particular, we can unambiguously pinpoint the centers of the loops to the locations of the inflection points on the curve: The inflection points are between the vertices where the bond angle in our gauge transformed frame changes its sign.
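
The sign-change criterion is simple to apply once the gauge transformed bond angles are available. A minimal Python sketch, assuming the signed bond angles are already given as a list (the $\mathbb{Z}_2$ gauge transformation (39) itself is not implemented here, and the name `sign_change_sites` is hypothetical):

```python
def sign_change_sites(bond_angles):
    """Return index pairs (i, i + 1) between which the bond angle flips sign.

    An exactly vanishing bond angle is not counted as a sign change.
    """
    return [(i, i + 1)
            for i in range(len(bond_angles) - 1)
            if bond_angles[i] * bond_angles[i + 1] < 0]


# Kink-like profile: positive on one side of the loop, negative on the other.
sites = sign_change_sites([1.5, 1.4, 0.9, -1.0, -1.5])
```

Each returned pair brackets a candidate inflection point; a long loop with several inflection points would yield several pairs.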

We have performed a similar analysis on several proteins in the PDB, and some of our results where the techniques of the present article are utilized have been reported in [20], [21], [22]. The results are remarkably consistent: In every secondary superstructure that we have studied where a loop connects two $\alpha$-helices and/or $\beta$-strands, after appropriate $\mathbb{Z}_2$ gauge transformations the profile of the bond angles in the loop can be described with sub-Ångström accuracy in terms of a discrete version of the kink-soliton in Figure 2. The two asymptotic ground states at $s = \pm\infty$ in this Figure correspond to the $\alpha$-helices and/or $\beta$-strands at the ends of the loop. For the $\alpha$-helices we have the Frenet frame values very close to

$$(\psi, \theta) \approx (1.57, 0.87) \sim \left(\frac{\pi}{2}, 1\right)$$

The $\beta$-strands can also be interpreted as helices, but in the "collapsed" limit with the approximate values

$$(\psi, \theta) \approx (\pm 1.0, -2.9) \sim (\pm 1, -\pi)$$

Consequently it appears that these $\alpha$-helix/$\beta$-strand - loop - $\alpha$-helix/$\beta$-strand superstructures are indeed inflection point solitons with the qualitative profile of (1). We remark that a long loop may also consist of a number of inflection points, i.e. it can be a multi-soliton configuration.



B. B: Physics based framing 



In every amino acid except glycine, there is a $C_\beta$ carbon that is covalently bonded to the $C_\alpha$ carbon. The positioning of these $C_\beta$ carbons in relation to their $C_\alpha$ carbons characterizes the relative orientation of the amino acid side chains along the protein backbone, and can be used to introduce a distinctive framing of the backbone; the case of glycine can be treated like that of an inflection point. Since the interactions between different amino acids are presumed to have a pivotal role both during the folding process and in the stabilization of the native fold, the $C_\beta$ framing should be a natural choice to intimately reflect the physical principles that determine the fold geometry of the backbone. Consequently one way to try to understand the physical principles that determine how a protein folds could be to investigate the framing along the protein backbone. Here we propose that a practical approach is to look for gauge parameters (26) that relate the $C_\beta$ frames to some purely geometrically determined frames such as the Frenet frames or parallel transport frames. The identification of the rules that determine the relevant gauge parameters should then provide insight into the physical principles that underlie the protein folding phenomenon.

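The $C_\beta$ framing is constructed from the tangent vectors $\mathbf{t}$ of the backbone and the unit vectors $\mathbf{c}$ that point from the $C_\alpha$ carbons towards their $C_\beta$ carbon. The framing is obtained by the Gram-Schmidt algorithm, by first introducing the unit vector

$$\mathbf{p} = \frac{\mathbf{t} \times \mathbf{c}}{\|\mathbf{t} \times \mathbf{c}\|}$$

and then completing it into an orthonormal frame $(\mathbf{t}, \mathbf{p}, \mathbf{q})$ at each $C_\alpha$ vertex, where

$$\mathbf{q} = \mathbf{t} \times \mathbf{p}$$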
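
The two formulas above translate directly into code. The following Python sketch is an illustration only; the function name `cbeta_frame` is hypothetical, and the tangent $\mathbf{t}$ is assumed to be already normalized.

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(dot(a, a))
    return (a[0] / norm, a[1] / norm, a[2] / norm)


def cbeta_frame(t, c):
    """Return the orthonormal frame (t, p, q) built from t and c.

    t: unit tangent of the backbone at a C-alpha vertex (assumed normalized)
    c: vector pointing from the C-alpha carbon towards its C-beta carbon
    Fails with ZeroDivisionError when c is parallel to t.
    """
    p = unit(cross(t, c))   # p = (t x c) / |t x c|
    q = cross(t, p)         # q = t x p
    return t, p, q


t, p, q = cbeta_frame((0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
```

Since $\mathbf{p}$ is normalized explicitly, $\mathbf{c}$ does not need to be a unit vector on input.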



In order to characterize the rules that determine the gauge parameter relating a $C_\beta$ frame to the corresponding Frenet frame, we have investigated the statistical distribution of the vectors $\mathbf{c}$ in the PDB proteins in the Frenet framing of the backbone. For this we introduce, at each backbone vertex, the inclination angle $\chi_i \in [0, \pi]$ between the tangent vector $\mathbf{t}_i$ and the corresponding vector $\mathbf{c}_i$, together with the azimuthal angle $\varphi_i \in [-\pi, \pi]$ between the normal vector $\mathbf{n}_i$ and the projection of $\mathbf{c}_i$ on the $(\mathbf{n}_i, \mathbf{b}_i)$ plane; see Figure 8.
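
The two angles can be evaluated from the components of the $C_\beta$ direction in the Frenet frame. The Python sketch below is an illustration under one assumption that the text does not fix: the azimuthal angle is taken to be positive when the projection of the vector turns from the normal towards the binormal. The function name `cbeta_angles` is hypothetical.

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cbeta_angles(n, b, t, c):
    """Return (chi, phi) for a unit vector c in the Frenet frame (n, b, t).

    chi in [0, pi]:   inclination angle between t and c
    phi in [-pi, pi]: azimuthal angle between n and the projection of c
                      on the (n, b) plane (sign convention assumed)
    """
    chi = math.acos(max(-1.0, min(1.0, dot(t, c))))
    phi = math.atan2(dot(b, c), dot(n, c))
    return chi, phi


# c along the binormal: a quarter turn away from both t and n.
chi, phi = cbeta_angles((1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
                        (0.0, 0.0, 1.0), (0.0, 1.0, 0.0))
```

Collecting the pairs over all backbone vertices gives the kind of directional statistics that the following figures summarize.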




FIG. 8: The definition of the angles $\chi_i$ and $\varphi_i$ that describe the location of the $C_\beta$ carbon with respect to the Frenet frame along the $C_\alpha$ backbone. The distance between the $C_\alpha$ and $C_\beta$ carbons is within the range of 1.56-1.57 Å.



We first consider the $C_\beta$ framing of HP35. When we compute the directions of the individual vectors $\mathbf{c}_i$ in the Frenet frame, we get the result that we display in Figure 9. Remarkably, the directions of the vectors $\mathbf{c}_i$ in the Frenet frame are relatively site independent. This implies that at least in the case of HP35, the parameters $\Delta_i$ that relate the $C_\beta$ frame to the Frenet frame can be assigned, to a high accuracy, a constant and site independent value: The physically determined orthonormalized $C_\beta$ frame appears to differ from the purely geometrically determined Frenet frame only by small nutations in the direction of the vectors $\mathbf{c}$ in the Frenet frame. We observe that these nutations are somewhat smaller in the helix regions than in the loops.

We conclude that since the Frenet framing of HP35 is determined entirely by the backbone geometry, so are the orientations of the amino acids, with surprisingly good accuracy.

In the general case, we have inspected the correlation between the $C_\beta$ framing and the Frenet framing by performing a statistical analysis of the directional distribution of the $\mathbf{c}$-vectors in the backbone Frenet frames for all amino acids in PDB. Our results are summarized by Figure 10 where we display the statistical distribution of the angles $(\varphi_i, \chi_i)$ that we have defined in Figure 8. We have used the PDB definition to identify the three structures we display separately ($\alpha$-helix, $\beta$-strand, loop) but we note that there are sometimes ambiguities in determining whether a particular amino acid belongs to an $\alpha$-helix, a $\beta$-strand or a loop, in particular when the amino acid is located in the vicinity of the border between these three classes.

FIG. 9: The nutation in the direction of the $C_\beta$ vectors $\mathbf{c}_i$ in the Frenet frame along the 1YRF backbone. The blue dots are the $C_\alpha$ carbons in the helices, and the red dots are the $C_\alpha$ carbons that are located in the loops.

FIG. 10: Kent plots of the $C_\beta$ carbon vectors $\mathbf{c}$ for all sites of all proteins presently in PDB, with color intensity proportional to the number of vectors. For $\alpha$-helices (left), the direction of $\mathbf{c}$ nutates very little around the direction $(\chi, \varphi) \approx (1.84, -2.20)$. For $\beta$-strands (right) the nutation is somewhat more spread, but still very clearly concentrated around $(\chi, \varphi) \approx (1.96, -2.47)$. Finally, for loops (center) we observe the formation of a narrow arc that connects the $\alpha$ and $\beta$ directions.

We find that the observation we have made in the case of HP35 persists: The orientations of the $C_\beta$ carbons in the Frenet frames are quite inert and essentially protein and amino acid independent. There is only a slight nutation around the statistical average value. Furthermore, the directions for the $\alpha$-helices and $\beta$-strands are also almost the same; the difference in the statistical average is surprisingly small but nevertheless noticeable. In the case of loops, we find that the statistical distribution of the vector $\mathbf{c}$ in the Frenet frame displays a thin band that connects the $\alpha$-helices and $\beta$-strands. This universality is somewhat unexpected, since only a small proportion of the loops connect an $\alpha$-helix with a $\beta$-strand.

The overlapping regions between the three different classes in the Kent plots of Figure 10 can be at least partly explained by the uncertainty in classifying amino acids in the vicinity of the border regions. We expect that a careful scrutiny of the class assignments of these amino acids will sharpen our statistical results. Alternatively, our approach could be developed into a technique to determine a more definite classification of those amino acids that are located in the border regions separating the $\alpha$-helices, $\beta$-strands and the loops from each other. But even at this level of classifying the amino acids the results of our analysis imply that almost independently of the protein, when we traverse its backbone by orienting the camera gaze direction so that it remains fixed in the Frenet frames, the directions of the $C_\beta$ carbons are subject to only small nutations.

In Figure 11 we display the histograms for the components of the $C_\beta$ vectors in terms of the $(\chi, \varphi)$ angles defined in Figure 8. These histograms confirm that the directions of the $C_\beta$ vectors vary surprisingly little.




FIG. 11: Frenet frame histogram of the distribution of the $(\chi, \varphi)$ angles displayed in Figure 10, for all $C_\beta$ carbons in the PDB. The histogram shows how the directions are subject to only very small deviations around their average values.

Finally, we have found that in Bishop's parallel transport frame the direction of the $C_\beta$ carbon does not lead to such a regular structure formation as in the Frenet frame; see Figure 12 where we plot the statistical distributions of the vectors $\mathbf{c}$ in Bishop's frames.




FIG. 12: The same as in Figure 10, but for all proteins in PDB using Bishop's parallel transport frame. In this frame the directions of the $C_\beta$ carbons are distributed in a longitudinally uniform manner inside a segment of the Kent sphere.



V. V: CONCLUSIONS 



We have scrutinized the problem of frame determination along piecewise linear discrete curves, including those with inflection points. Our approach is based on the transfer matrix method that has been previously applied extensively to investigate discrete integrable systems and lattice field theories. The introduction of a transfer matrix enables us to describe a framing in a covariant manner, with different frames related to each other by $SO(2)$ gauge transformations that correspond to rotations in the normal planes of the curve. In particular our construction is not based on, and does not involve, any discretization of a continuum equation. Consequently we can effortlessly describe curves that become fractals in the limit where the lattice spacing, i.e. the length of the line segments, vanishes. But we have also verified that if the continuum limit exists as a differentiable curve, we arrive at the generalized version of the continuum Frenet equation. Furthermore, the manifest covariance of our formalism under frame rotations enables us to investigate the framing of a physically determined discrete curve in a manner where the framing is based on, and captures the properties of, the underlying physical system. Consequently we expect that our formalism has wide applications to the visualization of discrete curves and the determination of camera gaze positions in a variety of scenarios.

One notable outcome of our analysis is the identification of inflection points with the centers of loops, and the interpretation of loops as kink-solitons. In [28], [29] we have already applied this identification to develop an Ansatz based on (1), to successfully describe the native folds of PDB proteins in terms of elementary functions.

As an example we have investigated the framing of folded proteins in the Protein Data Bank. In this case no valid continuum description exists, due to the fact that the universality class of folded proteins is characterized by the presence of a nontrivial Hausdorff dimension. Consequently any framing of folded proteins should be inherently discrete. In order to introduce a framing that directly relates to the physical properties of a folded protein, we have employed the relative orientation of the $C_\beta$ carbons in the amino acids with respect to the corresponding backbone central $C_\alpha$ carbons. We have statistically analyzed the relative orientation of these $C_\beta$ frames to the geometrically determined Frenet frames of the PDB proteins. We have found that the two framings are almost identical; they differ from each other only by a practically amino acid independent global frame rotation: For the $\alpha$-helices the nutation in the orientation of the $C_\beta$ carbons in the Frenet frame is very sharply concentrated around its statistically determined average direction. For $\beta$-strands the result is very similar, with only a relatively small increase in the amplitude of nutations. Finally, in the case of loops we find that the orientation of the $C_\beta$ carbons oscillates along a narrow circular arc that connects the $\alpha$-helices and $\beta$-strands. In each case we have used the definition employed in the Protein Data Bank to identify the helix/loop class of the amino acid, and we note that the existing criteria for determining this class in the case of an amino acid that is located in the vicinity of the terminals of each structure are subject to interpretation. Consequently we propose that there are several borderline cases that reduce the accuracy of our statistically determined results. We hope that our framing technique will eventually provide a refinement of the existing classification principles. The biophysical interpretation and biological relevance of our observations will be reported elsewhere.